BACKGROUND OF THE INVENTION

1. Field of the Invention

The invention relates to the field of digital image processing and in particular to characterizing 
video content. 

2. Related Art

US Pat. No. 5,179,832 teaches using some types of data in the video data stream to find scene 
changes. Scene changes are determined by comparison of data in consecutive frames. 

SUMMARY OF THE INVENTION

The object of the invention is to create a useful characterization of video
content.

This object is achieved by extracting key frames, grouping the key frames into
families and creating a family representation. Preferably, the family representation is in the
form of a histogram.

The family representation can be used to distinguish program boundaries, index
tapes, identify program material, edit video content, or search video content, for instance as
part of a web search engine.

BRIEF DESCRIPTION OF THE DRAWING

Fig. 1 is a high level view of the system in accordance with the invention. 

Fig. 2 is a flow chart showing operation of the invention. 

Fig. 3 shows grouping of video material into key frames, families, and
programs.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Fig. 1 shows a system including the invention. A processor 103 is connected to a
user output device 101, a user input device 102, and a memory device 104. The memory
stores digital video frame data and data abstracted from frames. The processor may optionally
be connected via a communication link 106 and a network 105 to some remote device (not
shown). The communication link may be wired or wireless and may lead to the remote device
in some way other than through a network. The network may be the internet.

Fig. 2 shows a flow chart describing operation of the preferred embodiment of
the invention.

In box 201, a key frame is located and its frame number stored. The frame 
number associated with each key frame must be retained throughout the procedure. 
Identification of key frames can be done in accordance with US patent application serial 
numbers 08/867,140 and 08/867,145, which are incorporated herein by reference. 

In box 202, a histogram is derived for the key frame. Deriving histograms from
video frames is described in R. C. Gonzalez and R. E. Woods, Digital Image Processing
(Addison Wesley 1992), pp. 235-247. Basically, every image is described by a number of
colors, called a palette. The number of colors in the palette can be chosen in accordance with
the application. The histogram gives a numerical value for each color. The numerical value is
the number of pixels with that color in the key frame. Each histogram must be associated with
a frame number.

For speed of processing, it is preferred, for the purposes of the invention, to
choose a less than exhaustive palette. Colors such as black, white, and skin tones, which are
likely to appear in all frames, are preferably not chosen for the palette, or are given a low
weight. In the less than exhaustive palette, each chosen color is referred to as a "bin." For
instance, the following bins might be used:
This set of bins is chosen according to a commonly used color space, but any color space
might be used, e.g., HSV, YUV, RGB. More information about histogram formation is to be
found in R. C. Gonzalez and R. E. Woods, Digital Image Processing (Addison Wesley 1992),
pp. 171-182. The set given above includes eight bins, but other numbers of bins might be used.
Also, other definitions of the bins might be used, according to what the designer finds most
effective. The different bins may be assigned different weights according to how important
they are considered to be, so the weights may be considered an additional column in the above
table.
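
The derivation of a key frame histogram over a less than exhaustive palette can be sketched as follows. This is only an illustrative sketch: the four-color palette, the distance cutoff MAX_DIST_SQ, and the function names are assumptions made for the example and are not the bins of the table above. Pixels that are far from every palette color (for instance black or white) are simply not counted, and the frame number is kept with the histogram.

```python
from collections import Counter

# Hypothetical palette (bin name -> representative RGB value); not taken from the source.
PALETTE = {
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "yellow": (255, 255, 0),
}
# Hypothetical cutoff: pixels farther than this (squared RGB distance) from
# every palette color fall outside the palette and are not counted.
MAX_DIST_SQ = 120 ** 2


def squared_distance(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_bin(pixel):
    """Return the closest palette bin, or None if the pixel matches no bin."""
    name = min(PALETTE, key=lambda n: squared_distance(pixel, PALETTE[n]))
    return name if squared_distance(pixel, PALETTE[name]) <= MAX_DIST_SQ else None


def key_frame_histogram(pixels, frame_number):
    """Count pixels per bin and keep the frame number with the histogram."""
    counts = Counter(nearest_bin(p) for p in pixels)
    return {"frame_number": frame_number,
            "bins": [counts[name] for name in PALETTE]}


frame_pixels = [(250, 10, 5), (240, 0, 0), (10, 10, 200), (255, 250, 10), (0, 0, 0)]
histogram = key_frame_histogram(frame_pixels, frame_number=120)
print(histogram)  # {'frame_number': 120, 'bins': [2, 0, 1, 1]}
```

In this example the black pixel is ignored, which mirrors the preference for leaving colors that appear in all frames out of the palette.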

After a histogram is defined for the key frame, based on the color bins, the
key frame is compared with any stored key frames at 203. This comparison is in accordance
with the following formula:

$$\mathrm{difference}(H_i, H_k) = \sum_j w_i \, v_j \, \lvert H_i(j) - H_k(j) \rvert < \mathrm{Threshold} \tag{1}$$

In this formula, the variable $H_i$ represents the histogram associated with the key frame of index
$i$, and $H_k$ represents the stored histogram with which it is compared. The vector value $H_i(j)$
is the numerical value of this histogram associated with the bin of index $j$. The variable
Threshold represents a value below which two histograms are to be considered similar.

The variable $v_j$ is a weight associated with the bin of index $j$. If all bins are
considered to have the same weight, then this bin weight can be omitted from formula (1).

The variable $w_i$ represents a weight to be assigned to a key frame of index $i$. It
should be noted that key frames associated with scenes of extremely short duration should be
given low weight, as these scenes are probably not significant to the program and may even
represent commercials. Thus very short scenes should probably not be considered as being
different from the scenes that precede them and should not be grounds for starting a new
family, per the following step.
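
The comparison of formula (1) can be sketched in a few lines, assuming the formula is the weighted sum of absolute per-bin differences that the variable definitions describe. The function names, the default weights, and the example values are assumptions made for illustration.

```python
def histogram_difference(h_i, h_k, bin_weights=None, frame_weight=1.0):
    """Weighted sum of absolute bin differences between two histograms.

    bin_weights plays the role of the per-bin weights v_j and frame_weight
    the role of the key frame weight w_i; both default to 1 (assumption).
    """
    if len(h_i) != len(h_k):
        raise ValueError("histograms must have the same number of bins")
    if bin_weights is None:
        bin_weights = [1.0] * len(h_i)
    return sum(frame_weight * v * abs(a - b)
               for v, a, b in zip(bin_weights, h_i, h_k))


def is_similar(h_i, h_k, threshold, **weights):
    """Two histograms are considered similar when the difference is below threshold."""
    return histogram_difference(h_i, h_k, **weights) < threshold


current = [10, 0, 5, 5]
stored = [8, 2, 5, 3]
print(histogram_difference(current, stored))                       # 6.0
print(histogram_difference(current, stored, [1, 1, 0.5, 2], 0.5))  # 4.0
print(is_similar(current, stored, threshold=7))                    # True
```

A low frame_weight shrinks the difference, so a key frame from a very short scene is less likely to be judged different from the stored histograms.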

If the short scenes are never to be used for any purpose, it may be more efficient
simply to test for scene length after box 201; and, if scene length is insufficient, control could
be returned to box 201, without proceeding to the rest of the flow chart.

Formula (1) is only one of many ways of calculating a distance between
histograms. Other ways of calculating such distances are discussed in S. Antani et al., Pattern
Recognition Methods in Image and Video Databases: Past, Present and Future, Proceedings 7th
International Workshop on Structural and Syntactic Pattern Recognition and 2nd International
Workshop on Statistical Techniques in Pattern Recognition (Aug. 1998), Sydney, Australia,
pre-published on the internet at http://machinevision.cse.psu.edu/~antani/pubs.html on 7/9/98.

At 205 there is a branch. If the difference calculated according to formula (1) is
less than Threshold, then the key frame represented by the current histogram is to be grouped
into a stored family at 204. If the difference is greater than the stored threshold, a new family
is formed at 206. In either case, a new family is formed or the histogram is grouped into a
current family only if the duration of the scene associated with the current histogram is
considered sufficient, e.g., more than one minute.

At 206, a new family is formed. Each family is formed by a data structure with
pointers to each of the constituent histograms and frame numbers;
a family histogram, initialized to the current histogram; and
a variable representing total duration, which is initialized to the duration of the scene
represented by the current histogram.
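
One possible shape for the family data structure formed at 206 is sketched below. The class name Family, its field names, and the helper new_family are hypothetical; the sketch only illustrates the three components listed above and their initialization from the current histogram.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Family:
    # Family histogram, initialized to the current key frame histogram.
    histogram: List[float]
    # Total duration of all scenes in the family, in frames (assumed unit).
    total_duration: int
    # References to the constituent (frame number, histogram) pairs.
    members: List[Tuple[int, List[int]]] = field(default_factory=list)


def new_family(frame_number, duration, key_frame_histogram):
    """Form a new family from a single key frame (box 206)."""
    return Family(
        histogram=[float(v) for v in key_frame_histogram],
        total_duration=duration,
        members=[(frame_number, key_frame_histogram)],
    )


family = new_family(frame_number=120, duration=1800, key_frame_histogram=[2, 0, 1, 1])
print(family.total_duration, family.histogram)  # 1800 [2.0, 0.0, 1.0, 1.0]
```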

At 204, the current histogram is grouped into a family and the family histogram is
recalculated according to the following formula:

$$H_{\mathrm{fam}}(j) = \sum_i \frac{\mathrm{dur}_i}{\mathrm{total\_dur}_{\mathrm{fam}}} \, H_i(j) \tag{2}$$

In this formula

$j$ is a variable representing the bin number;
$\mathrm{fam}$ is an index representing this particular family;
$H_{\mathrm{fam}}$ is a vector representing the family histogram;
$i$ is an index representing the scene number. This index starts at 1 for the first scene added to
the family and runs through the number of scenes in the family, including the current one
being added;
$\mathrm{dur}_i$ is a variable representing the duration of scene $i$. This duration is obtained by subtracting
the frame number of the current key frame $i$ from the frame number corresponding to the
following key frame $i+1$;
$H_i(j)$ is the numerical value indicating the number of pixels in bin $j$ for key frame number $i$;
and
$\mathrm{total\_dur}_{\mathrm{fam}}$ is a variable representing the total duration of all scenes already in the family.
This variable must be updated by adding the duration of the scene associated with the current
histogram, prior to making the calculation of formula (2).
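
As a worked illustration of formula (2), the following sketch computes a family histogram as the duration-weighted average of its constituent key frame histograms. The function names and the two-bin example values are assumptions; durations are measured in frames, obtained from consecutive key frame numbers.

```python
def scene_duration(frame_number, next_frame_number):
    """Duration of a scene: next key frame number minus current key frame number."""
    return next_frame_number - frame_number


def family_histogram(scenes):
    """Duration-weighted average of the scene histograms.

    scenes is a list of (duration, histogram) pairs; the total duration
    includes the scene currently being added.
    """
    total_duration = sum(duration for duration, _ in scenes)
    n_bins = len(scenes[0][1])
    return [
        sum(duration / total_duration * hist[j] for duration, hist in scenes)
        for j in range(n_bins)
    ]


scenes = [
    (scene_duration(0, 100), [10, 0]),    # scene 1 lasts 100 frames
    (scene_duration(100, 400), [2, 8]),   # scene 2 lasts 300 frames
]
print(family_histogram(scenes))  # [4.0, 6.0]
```

Here the longer scene contributes three quarters of the family histogram: 0.25 * 10 + 0.75 * 2 = 4 for the first bin and 0.25 * 0 + 0.75 * 8 = 6 for the second.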

At 207 it is tested whether all key frames have been found. If not, control
returns to box 201. If so, each family is now represented by a single histogram, which is used
for comparison purposes when new key frames are detected in a stream of video content. It
would be expected that most half hour programs could be represented by about 3 families of
histograms, though more or fewer could be used depending on the programming in question. A
longer show might need more. Families of key frames characterizing a program can be built
on the fly in accordance with the invention. Alternatively, families of key frames might be
pre-stored corresponding to existing programs that a user might want to identify.
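
The loop of boxes 201 through 207 might be sketched as follows. This is a simplified sketch under several assumptions: each new key frame is compared with the existing family histograms and joined to the closest one, bin and frame weights are taken as 1, and MIN_DURATION, THRESHOLD, the dictionary layout, and the sample data are all hypothetical.

```python
MIN_DURATION = 50   # hypothetical minimum scene length, in frames
THRESHOLD = 10.0    # hypothetical similarity threshold


def difference(h_a, h_b):
    """Unweighted sum of absolute bin differences."""
    return sum(abs(a - b) for a, b in zip(h_a, h_b))


def group_key_frames(key_frames):
    """Group (frame_number, duration, histogram) triples into families."""
    families = []
    for frame_number, duration, hist in key_frames:
        if duration < MIN_DURATION:
            continue  # very short scene: neither grouped nor a new family
        closest = min(families, default=None,
                      key=lambda f: difference(f["histogram"], hist))
        if closest is not None and difference(closest["histogram"], hist) < THRESHOLD:
            # Box 204: duration-weighted update of the family histogram.
            total = closest["total_duration"] + duration
            closest["histogram"] = [
                (old * closest["total_duration"] + new * duration) / total
                for old, new in zip(closest["histogram"], hist)
            ]
            closest["total_duration"] = total
            closest["frames"].append(frame_number)
        else:
            # Box 206: start a new family from the current histogram.
            families.append({"histogram": [float(v) for v in hist],
                             "total_duration": duration,
                             "frames": [frame_number]})
    return families


key_frames = [
    (0, 100, [100, 0, 0]), (100, 10, [50, 50, 0]), (110, 90, [0, 100, 0]),
    (200, 100, [98, 2, 0]), (300, 100, [1, 99, 0]), (400, 10, [30, 30, 40]),
    (410, 90, [0, 0, 100]), (500, 100, [0, 1, 99]),
]
families = group_key_frames(key_frames)
print([f["frames"] for f in families])  # [[0, 200], [110, 300], [410, 500]]
```

The incremental update keeps each family histogram equal to the duration-weighted average of its members, so the family never needs to revisit its earlier histograms.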

Once families are created, they can be used for numerous purposes. One is that 
the families can be used to distinguish program boundaries. 

Fig. 3 shows video information grouped into key frames, represented by
histograms. In this example, the histograms are called $H_1$, $H_2$, $H_3$, $H_4$, $H_5$, $H_6$, $H_7$, and $H_8$.
Real video information could have more or fewer key frames. Certain of the histograms are
grouped into families. For instance, $H_1$ and $H_4$ are both in a family represented by histogram
$H_{1,4}$; $H_3$ and $H_5$ are both in a family represented by histogram $H_{3,5}$; and $H_7$ and $H_8$ are both in a
family represented by histogram $H_{7,8}$. The histograms $H_2$ and $H_6$ are categorized as
"disruptive." In other words, the video duration associated with those key frames is so short
that they are not considered useful in identifying programming. Accordingly, their weights $w_2$
and $w_6$ in equation (1) will be low and they will not be put into families according to boxes
204 and 206 of Fig. 2. The formation of these families is in accordance with boxes 204, 205
and 206 in Fig. 2.
The indices illustrated in the figures are purely examples. More or fewer indices
of different values could be used by those of ordinary skill in the art depending on the actual
video content.

A program boundary is placed between $H_6$ and $H_7$ in accordance with box 209
of Fig. 2.

The family histograms $H_{1,4}$ and $H_{3,5}$ could then constitute a characterization of
the first program. These family histograms could be stored for searching for programming on
a database or on a VCR.

Another possible use of the families of histograms is to compare programs.
Boundaries of programs could be found in accordance with Fig. 3.
Alternatively, program boundaries could be determined manually. An algorithm for finding
program boundaries follows:

*******

Let Fi be FamilyGroup i in the list of FamilyGroups F; n is the size of the list of FamilyGroups.

MIN(Fi) is the minimum keyframeNr in the FamilyGroup Fi.
MAX(Fi) is the maximum keyframeNr in the FamilyGroup Fi.

Algorithm

Assumption: F is sorted on MIN(Fi)

FOR (i=0; i < n-1; i++)
DO
    j = i+1;
    IF (MAX(Fi) > MIN(Fj))
    THEN
        BoundaryBegin = MAX(MIN(Fj), BoundaryBegin)
        BoundaryEnd = MAX(MAX(Fj), BoundaryEnd)
    ELSE
        BoundaryBegin = MAX(MIN(Fj), BoundaryBegin)
        BoundaryEnd = MAX(MAX(Fi), BoundaryEnd)
    ENDIF
    IF (BoundaryEnd < BoundaryBegin)
    THEN
        PRINT 'Found a BOUNDARY between BoundaryEnd and BoundaryBegin'
    ENDIF
DONE

PRINT 'BOUNDARY at n (end of video)'

*************
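
The idea behind this boundary search can be sketched in Python. The sketch is a variation rather than a transcription of the listing: it assumes each family group is given as a list of key frame numbers, and it keeps a running maximum of the key frame numbers seen so far, so that a family nested inside a longer family does not produce a spurious boundary. The function name and the sample data are hypothetical.

```python
def find_program_boundaries(family_groups):
    """Return (end, begin) pairs of key frame numbers between which a boundary lies.

    A boundary is reported whenever every family seen so far ends before
    the next family (in order of first key frame) begins.
    """
    groups = sorted((g for g in family_groups if g), key=min)
    boundaries = []
    if not groups:
        return boundaries
    running_end = max(groups[0])
    for group in groups[1:]:
        begin = min(group)
        if running_end < begin:
            boundaries.append((running_end, begin))
        running_end = max(running_end, max(group))
    return boundaries


# Families loosely modeled on the grouping of Fig. 3 (key frame numbers are made up).
groups = [[0, 200], [110, 300], [410, 500]]
print(find_program_boundaries(groups))  # [(300, 410)]
```

The first two families interleave and are treated as one program; the third starts after both have ended, so a boundary is reported between key frames 300 and 410.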

After program boundaries and families are determined, let us say that program
S is characterized by four family histograms

$$SH = (SH_1, SH_2, SH_3, SH_4)$$

and let us say that program B is characterized by four family histograms

$$BH = (BH_1, BH_2, BH_3, BH_4)$$
Then the difference between two programs can be calculated according to the following
formula

$$\mathrm{Diff}(SH, BH) = \sum_i \min_j D(SH_i, BH_j) \tag{3}$$

where

$$D(H_i, H_j) = \sum_k \lvert H_i(k) - H_j(k) \rvert \tag{4}$$

and where $H_i(k)$ is the value in bin $k$ of histogram $H_i$. Alternatively, weights could be applied
to certain families if those were considered particularly important for characterizing the
program, in which case the function D could be defined according to the following formula.

(5)
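
The program comparison can be sketched as follows, assuming formula (3) sums, over the family histograms of program S, the distance of formula (4) to the closest family histogram of program B. The function names and the two-family example are assumptions made for illustration, and no family weights are applied.

```python
def bin_distance(h_i, h_j):
    """Formula (4): sum of absolute differences over all bins."""
    return sum(abs(a - b) for a, b in zip(h_i, h_j))


def program_difference(sh, bh):
    """Assumed reading of formula (3): for each family histogram of S,
    take the distance to the closest family histogram of B, then sum."""
    return sum(min(bin_distance(s, b) for b in bh) for s in sh)


program_s = [[10, 0], [0, 10]]
program_b = [[9, 1], [1, 9]]
program_c = [[5, 5], [5, 5]]

print(program_difference(program_s, program_b))  # 4
print(program_difference(program_s, program_c))  # 20
```

Under this reading a small value indicates that every family of S has a close counterpart in B. The measure is not symmetric in general, so the order of the two programs matters.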

The differences between programs could be used in searching databases of
video content or the world wide web. Alternatively, the differences could be used for
marking programming suspected to not be what it is supposed to be, e.g., if pre-emptive
programming were inserted into a video stream rather than scheduled programming. In such a
case, the user might want to skip or pay particular attention to pre-emptive programming.
Program comparison might even be used to identify unwanted programming for deletion from
a database of video content.

In characterizing a series of programs, it may be useful to develop
super-histograms for the series. In such a case, formula (2) can be used to combine scenes from
several programs from the series to result in families that characterize all of the several
programs. Such super-histograms could then be used to search for programs of that series in a
database or stream of video content.

1. A digital data processing method for characterizing video content, the method
comprising executing the following operations in a digital data processing device (103):
a. extracting (201) key frames from the video content, each respective key frame
representing a respective scene in the video content;
b. grouping (204, 206) at least some of the key frames into at least one family of
key frames;
c. establishing a respective family representation for the family;
d. embodying at least one such family representation in a storage medium to yield
a characterization of the video content, the storage medium being readable by the digital data
processing device.

2. The method of claim 1
a. further comprising the step of collapsing (202) each respective key frame into a
plurality of bins to yield a respective histogram for each key frame;
b. wherein the grouping operation comprises grouping significant key frames
based on a comparison (203, 205) of the respective histograms;
c. wherein the family representation is a family histogram; and
d. wherein the characterization comprises at least one such family histogram.

3. The method of claim 2 wherein the grouping operation further comprises
a. comparing (203) each new key frame in a series with at least one family
histogram to yield a difference measurement; and
b. if the difference measurement exceeds a predetermined threshold value,
determining (206) that the new key frame does not fall within the at least one family.

4. A storage medium (104) readable by the digital data processing device
embodying a characterization of video content produced according to the method of claim 1.

5. A digital data processing device (103) comprising the storage medium (104) of
claim 4 and a digital data processor for processing data stored on the storage medium.

6. The digital data processing device of claim 5 wherein the digital data processor
is adapted to index video content based on the characterization.

7. The digital data processing device of claim 5 wherein the digital data processor
is adapted to search for video content based on the characterization.

8. The digital data processing device of claim 5 wherein the digital data processor
is adapted to determine program boundaries based on the video content.

9. The digital data processing device of claim 5 comprising a communication link
(106) adapted to allow browsing of the medium from a remote device.

10. The digital data processing device of claim 9 wherein the communication link
(106) comprises an internet connection.